ptstorage, *: deprecate storage.ex with txn-bound internal executor #91404

ZhouXing19 · 2022-11-07T15:38:05Z

This is part of the effort to reduce the usage of free-floating internal executors and use those bound to the outer txn.

Most of the changes in this PR are mechanical. The one that may interest you is the last commit that passes the outer txn
from the internal executor to the conn executor's planner. Without it, it will fail when creating a job while committing the txn.

Informs: #91004

cockroach-teamcity · 2022-11-07T15:38:25Z

This change is

So that we can create an internal executor bound to the txn. Release note: None

….GetState() Internal executor should always bound to the txn. They should be created and passed to functions together. Release note: None

….GetMetadata Release note: None

…etRecord() This is part of the effort to reduce free floating internal executor, and use those bound to the outer txn. Release note: None

…UpdateTimestamp() This is part of the effort to reduce free floating internal executor, and use those bound to the outer txn. Release note: None

…rkVerified() IE means sqlutil.InternalExecutor. This is part of the effort to reduce free floating internal executor, and use those bound to the outer txn. Release note: None

…rotect() Release note: None

Release note: None

ajwerner

That's some serious plumbing. The fatalist part of me is inclined to accept it, because it moves on a goal, and, hey we can always clean things up later. The question I feel required to ask is whether this is going to make such future cleanup all the more worse.
This is a change which I understand the motivation for, but is causing me anxiety because of the sprawl it is forced to contend with. At its core, the cruft feels like it's coming from a failure to separate dependencies from argument in interfaces forcing dependency injection to flow through many layers.

I wonder if we might been better off re-thinking the API which took a *kv.Txn to make them indirect through an object to hold the transaction or what not.

For example, if today we have (*jobs.Job).Update imagine if we had:

package jobs

func (j *Job) WithTxn(txn *kv.Txn, ie sqlutil.InternalExecutor) Updater {
   return Updater{j: j, txn: txn, ie: ie}
}

func (j *Job) NoTxn() Updater {
   return Updater{j: j}
}

func (u Updater) Update(fn UpdateFn) error {
   // ...
}

Then we could hang a whole lot of other methods off this structure too. I think the same pattern could be applied elsewhere. Consider the protected timestamp stuff. What if that had layers such that there was an interface which just took arguments and then mechanisms to construct those interfaces either with no txn, meaning the methods would create their own transactions, or with the txn and ie.

separately, when this is going to merge, squash the commits.

Reviewed 6 of 6 files at r1, 3 of 3 files at r2, 4 of 4 files at r3, 12 of 12 files at r4, 5 of 5 files at r5, 27 of 27 files at r6, 7 of 7 files at r7, 105 of 105 files at r8, 19 of 23 files at r9.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @miretskiy, @rafiss, @renatolabs, @rhu713, @smg260, and @ZhouXing19)

pkg/ccl/changefeedccl/alter_changefeed_stmt.go line 100 at r8 (raw file):

		var job *jobs.Job
		if err := p.WithInternalExecutor(ctx, func(ctx context.Context, txn *kv.Txn, ie sqlutil.InternalExecutor) error {
			job, err = p.ExecCfg().JobRegistry.LoadJobWithTxn(ctx, jobID, p.Txn(), ie)

I feel confused by a lot of this. Why are we using p.WithInternalExecutor if the JobRegistry has these dependencies already? I'm starting to form a view that ambiguity about whether at Txn/IE combo is allowed to be nil is a problem. If it is allowed to be nil, why are we created a txn here? If it's not then that's its own problem.

pkg/jobs/scheduled_job_executor.go line 30 at r8 (raw file):

	// Executes scheduled job;  Implementation may use provided transaction.
	// Modifications to the ScheduledJob object will be persisted.
	ExecuteJob(ctx context.Context, cfg *scheduledjobs.JobExecutionConfig, env scheduledjobs.JobSchedulerEnv, schedule *ScheduledJob, txn *kv.Txn, ie sqlutil.InternalExecutor) error

nit: rewrap this

pkg/jobs/scheduled_job_executor.go line 68 at r8 (raw file):

	// SCHEDULE` query. OnDrop may drop the schedule's dependent schedules and will
	// return the number of additional schedules it drops.
	OnDrop(ctx context.Context, scheduleControllerEnv scheduledjobs.ScheduleControllerEnv, env scheduledjobs.JobSchedulerEnv, schedule *ScheduledJob, txn *kv.Txn, descsCol *descs.Collection, ie sqlutil.InternalExecutor) (int, error)

nit: rewrap this

pkg/sql/job_exec_context.go line 121 at r8 (raw file):

		ctx context.Context,
		run func(ctx context.Context, txn *kv.Txn, ie sqlutil.InternalExecutor) error,
	) error

if we have an ExecCfg, why this?

Code quote:

	WithInternalExecutor(
		ctx context.Context,
		run func(ctx context.Context, txn *kv.Txn, ie sqlutil.InternalExecutor) error,
	) error

pkg/sql/planhook.go line 131 at r6 (raw file):

	BufferClientNotice(ctx context.Context, notice pgnotice.Notice)
	Txn() *kv.Txn
	WithInternalExecutor(

do we really need this given we have the internal executor factory already in the ExecCfg?

Code quote:

		ctx context.Context,
		run func(ctx context.Context, txn *kv.Txn, ie sqlutil.InternalExecutor) error,
	) error

pkg/sql/schemachanger/scdeps/exec_deps.go line 48 at r8 (raw file):

	MakeJobID() jobspb.JobID
	CreateJobWithTxn(ctx context.Context, record jobs.Record, jobID jobspb.JobID, txn *kv.Txn) (*jobs.Job, error)
	UpdateJobWithTxn(ctx context.Context, jobID jobspb.JobID, txn *kv.Txn, ie sqlutil.InternalExecutor, useReadLock bool, updateFunc jobs.UpdateFn) error

nit: wrap this

ZhouXing19 · 2022-11-15T19:28:32Z

I wonder if we might been better off re-thinking the API which took a *kv.Txn to make them indirect through an object to hold the transaction or what not.

I agree -- passing the ie along the txn through layers is torturing :)
I tried modified structs that might match what you mentioned:

// SQLTranslator is the concrete implementation of spanconfig.SQLTranslator.
type SQLTranslator struct {
	ptsProvider protectedts.Provider
	codec       keys.SQLCodec
	knobs       *spanconfig.TestingKnobs
	txnBundle txnBundle
}

// txnBundle is created to emphasize that the SQL translator is correspond to
// a certain txn, and all fields here are a whole. It essentially keeps the
// semantics of “translate at a snapshot in time”. This means that this
// txnBundle should be written only in `NewTranslator`.
type txnBundle struct {
	txn      *kv.Txn
	descsCol *descs.Collection
	ie sqlutil.InternalExecutor
}

Do you think this is what you would prefer?

ajwerner · 2022-11-15T19:59:32Z

That commit itself is fine, but I don't really understand the exported getters. I think the next step would then be to give the protected timestamp interfaces a similar treatment such that when interacting with protected timestamps you don't think about which txn you're using either.

ZhouXing19 · 2022-11-15T20:21:24Z

Is this roughly close to what you were thinking? ZhouXing19@4de105e#diff-25c1841c99ae0297b18c357c13b913e454b2cafb70cc7fa557326dc11a1ff892R49

ajwerner · 2022-11-15T21:21:16Z

Not quite what I had in mind. I was not expecting there to be any setters or any exported logic that is aware of whether anything was set.

ajwerner · 2022-11-15T21:22:19Z

I guess it is like what I had in mind in that you removed the Txn argument from GetMetadata

ajwerner · 2022-11-15T21:22:39Z

Imagine if the Txn argument was removed from all of the methods in that interface.

rafiss · 2022-11-15T22:10:29Z

I wonder where we should limit the scope of this refactor / updating internalExecutor usages. The redesign being proposed here seems like something that would be better suited for the owners of the protectedts code. Part of the reason I was advocating for an earlier announcement of the new internal executor interfaces was so that the work to use the new interfaces and discover opportunities to improve interfaces would come to the teams that own the various parts of the code that use the internal executor.

It does not seem tenable to have Jane (or more generally, SQL engineers) make somewhat substantial refactors in any area of the code just because it so happens to touch the internal executor. I recall earlier when Jane pushed back against having to make the updates for all the internal executor usages because it would be a giant undertaking, the response was that it was a lot of changes, but all mechanical, and mechanical refactors are good. It seems to me that we're going further away from just mechanical refactors.

Concretely: I would like Jane's project to not continue to balloon in scope too much. I feel that a way to achieve that is to keep the refactoring here mechanical, and leave any deeper questions of how the protectedts package should plumb different objects up to the code owners. Thoughts?

ajwerner · 2022-11-15T23:04:26Z

I feel that a way to achieve that is to keep the refactoring here mechanical, and leave any deeper questions of how the protectedts package should plumb different objects up to the code owners. Thoughts?

I buy the sentiment. It was where I started when I first saw the PR. I was going to write that you have my stamp. I have a harder time imagining that unnamed codeowners are going to come around and do anything unless there are names and deadlines.

What really got me to rethink here was that it feels to me like the mechanical refactors could be made smaller with some initial refactors elsewhere. These nominally mechanical changes do introduce new txn loops right-and-left and then require capturing things in closure scope that dilutes the intention of the code. It seems pretty clear how to rework these interfaces to avoid the fundamental flaws and if we don't do it, this codebase is reaching the point where we'll never do them.

I think that if we want to ease the burden, what we should do is create two tasks:

Move the optional txn from the sprawling set of methods on the *jobs.Job object to an intermediate structure
Remove the optional txn from the sprawing set of methods on the protectedts.Storage interface out to an intermediate structure

Then say that progress is blocked on that. I'm interested in helping with that. It's hard for me to find the time to do that in the next week.

I reference fatalism above because it's a feeling that I get when thinking about these various choices. I want to at least have some courage to ask us to understand more deeply why this refactor hurts and what other future refactor we might need to do will hurt just as much, or more because we're not attacking the root cause of the inflexibility. What's going on here is to couple, philosophically, sql transactional concepts with other parts of the sql subsystem like via the internal executor.

Having written the above statement, perhaps what we ought to do is to make a new struct which can hold these two objects. That might make this change more mechanical. I still don't feel like it's better than decomposing the dependencies from the arguments, but it'd be something.

srosenberg · 2022-11-16T00:14:34Z

pkg/kv/kvserver/protectedts/ptstorage/storage.go

@@ -264,7 +264,9 @@ func (p *storage) GetRecord(
 	return &r, nil
 }

-func (p *storage) MarkVerified(ctx context.Context, txn *kv.Txn, id uuid.UUID) error {
+func (p *storage) MarkVerified(
+	ctx context.Context, txn *kv.Txn, executor sqlutil.InternalExecutor, id uuid.UUID,


Is executor intended to replace p.ex a few lines below? Below (in Release) the same arg. is called ie.

As a person unfamiliar with this code, I find the new interface utterly confusing. Can p.ex be injected directly with the correct InternalExecutor across method boundaries as opposed to threading it? The fact that both executors are now in the same lexical scope makes it error-prone.

Is executor intended to replace p.ex a few lines below?

Yes, and it replaced the p.ex in a later commit. Eventually, this PR removed all usages of p.ex, and removed it.

Thanks Jane, I didn't realize I was looking at an intermediate commit. Feel free to ignore my follow-up comment, but I am still curious why we're threading the InternalExecutor dependency instead of embedding.

Can p.ex be injected directly with the correct InternalExecutor across method boundaries as opposed to threading it?

I believe this is what Andrew also suggested here, so I replied below.

ZhouXing19 · 2022-11-16T15:07:27Z

I think that if we want to ease the burden, what we should do is create two tasks:

Move the optional txn from the sprawling set of methods on the *jobs.Job object to an intermediate structure

Remove the optional txn from the sprawing set of methods on the protectedts.Storage interface out to an intermediate structure

Maybe we can just replace the txn in those methods' argument with sqlutil.InternalExecutor?

For the 2nd part, I think we can remove the txn from the arguments of those methods, as the txn is only used for internal executor query functions. What about we have a ie.GetTxn() as a (temporary) replacement in these query functions?

I.e. can we have

func (p *storage) UpdateTimestamp(
	ctx context.Context,  id uuid.UUID, timestamp hlc.Timestamp, ie sqlutil.InternalExecutor,
) error {
	row, err := ie.QueryRowEx(ctx, "protectedts-update", ie.GetTxn(),
		sessiondata.InternalExecutorOverride{User: username.NodeUserName()},
		updateTimestampQuery, id.GetBytesMut(), timestamp.WithSynthetic(false).AsOfSystemTime())
	if err != nil {
		return errors.Wrapf(err, "failed to update record %v", id)
	}
	if len(row) == 0 {
		return protectedts.ErrNotExists
	}
	return nil
}

For the 1st part I think it's the same, but a bit more implicit. I looked at all usages of Job.Update() and most of txn arguments are useless, or they got passed to methods in protectedts.Storage. So similarly, we can just have

func (j *Job) Update(ctx context.Context, ie sqlutil.InternalExecutor, updateFn UpdateFn) error {
	const useReadLock = false
	return j.update(ctx, ie, useReadLock, updateFn)
}

If we want to do this, it seems still pretty mechanical to me and I don't think intermediate structures are needed anymore.
But still, this means that we're still refactoring stuff through the usages of methods/functions. It just makes it look cleaner , but the number of changes, I assume, is pretty much the same.

ZhouXing19 · 2022-11-16T15:32:40Z

Maybe we can just replace the txn in those methods' argument with sqlutil.InternalExecutor?

Continuing on this but on a higher level, do you think it makes sense that txn should always be encompassed by an internal executor, rather than being passed in parallel with an internal executor to methods?

ZhouXing19 · 2022-11-16T16:03:51Z

I wonder if we might been better off re-thinking the API which took a *kv.Txn to make them indirect through an object to hold the transaction or what not.

Can p.ex be injected directly with the correct InternalExecutor across method boundaries as opposed to threading it?

Given that txn might not be needed for Job.Update() and methods of protectedts.Storage, can we do this:

// Job manages logging the progress of long-running system processes, like
// backups and restores, to the system.jobs table.
type Job struct {
	// TODO(benesch): avoid giving Job a reference to Registry. This will likely
	// require inverting control: rather than having the worker call Created,
	// Started, etc., have Registry call a setupFn and a workFn as appropriate.
	registry *Registry
	ex *sqlutil.InternalExecutor

	id        jobspb.JobID
	createdBy *CreatedByInfo
	session   sqlliveness.Session
	mu        struct {
		syncutil.Mutex
		payload  jobspb.Payload
		progress jobspb.Progress
		status   Status
		runStats *RunStats
	}
}

func (j *Job) WithTxn(txnBoundInternalExecutor sqlutil.InternalExecutor, f func() error) error {
	j.ex = txnBoundInternalExecutor
	defer func() { j.ex = nil }()
	return f()
}

// storage interacts with the durable state of the protectedts subsystem.
type storage struct {
	settings *cluster.Settings

	knobs *protectedts.TestingKnobs
	ie sqlutil.InternalExecutor
}

func (s *storage) WithTxn(txnBoundInternalExecutor sqlutil.InternalExecutor, f func()) {
	s.ie = txnBoundInternalExecutor
	defer func() { s.ie = nil }()
	f()
}

So that the internal executor hangs off Job or storage only in a certain closure, to avoid getting misused under a txn that it's not bound to.

ZhouXing19 · 2022-11-22T14:56:51Z

Per an offline convo with @ajwerner we've decided to pause on this work. We need refactoring of the internal executor interface and related functions in sqlutil. Essentially, there needs to be a better binding of *kv.Txn and sqlutil.InternalExecutor, and fewer functions with closures.
I'll go back to this PR and the remaining migration after the refactoring is done.

ajwerner · 2022-11-22T15:01:02Z

I've been hacking on it in the background. It'll take a little more focused time, but I think the end state that's coming together is starting to feel good. Thanks for kicking off the conversation with this.

ZhouXing19 · 2023-01-23T21:01:58Z

Closing this as it has been done by #93218.

ZhouXing19 force-pushed the migrate-ie-1103 branch 5 times, most recently from a616ba6 to d9e1ba9 Compare November 9, 2022 22:12

ZhouXing19 mentioned this pull request Nov 9, 2022

storage: add ie to storage.GetState and GetMetadata #91409

Closed

ZhouXing19 force-pushed the migrate-ie-1103 branch 2 times, most recently from 994c976 to a7dfb4b Compare November 15, 2022 15:00

ZhouXing19 added 10 commits November 15, 2022 10:13

ptcache: add sqlutil.InternalExecutorFactory field for Cache and Config

7cea4e7

So that we can create an internal executor bound to the txn. Release note: None

ptstorage: add sqlutil.InternalExecutorFactory to storageWithDatabase

87812c3

So that we can create an internal executor bound to the txn. Release note: None

ptreconciler: add sqlutil.InternalExecutorFactory to Reconciler

db91524

So that we can create an internal executor bound to the txn. Release note: None

ptstorage, *: add InternalExecutor to the parameter list of pgStorage…

b992a4b

….GetState() Internal executor should always bound to the txn. They should be created and passed to functions together. Release note: None

ptstorage, *: add InternalExecutor to the parameter list of ptstorage…

93582e7

….GetMetadata Release note: None

ptstorage, *: add InternalExecutor to the parameter list of storage.G…

3b19e9b

…etRecord() This is part of the effort to reduce free floating internal executor, and use those bound to the outer txn. Release note: None

ptstorage, * : add InternalExecutor to the parameter list of storage.…

29351d9

…UpdateTimestamp() This is part of the effort to reduce free floating internal executor, and use those bound to the outer txn. Release note: None

job, ptstorage: add IE to job.Update(), storage.Release(), storage.Ma…

1ad61a1

…rkVerified() IE means sqlutil.InternalExecutor. This is part of the effort to reduce free floating internal executor, and use those bound to the outer txn. Release note: None

ptstorage, *: add InternalExecutor to the parameter list of storage.P…

298e301

…rotect() Release note: None

zipper: add ieNotBoundToTxn to InternalInflightTraceZipper

a00f7f5

Release note: None

ZhouXing19 changed the title ~~storage, job: migrate usages of internal executor~~ ptstorage, *: deprecate storage.ex with txn-bound internal executor Nov 15, 2022

ZhouXing19 force-pushed the migrate-ie-1103 branch from a7dfb4b to a00f7f5 Compare November 15, 2022 15:31

sql: pass txn from internal executor to conn executor's planner

19b2f20

ZhouXing19 marked this pull request as ready for review November 15, 2022 18:14

ZhouXing19 requested a review from a team November 15, 2022 18:14

ZhouXing19 requested review from a team as code owners November 15, 2022 18:14

ZhouXing19 requested a review from a team November 15, 2022 18:14

ZhouXing19 requested review from a team as code owners November 15, 2022 18:14

ZhouXing19 requested review from herkolategan, srosenberg, ajwerner, a team, smg260, renatolabs and rafiss and removed request for a team, herkolategan and srosenberg November 15, 2022 18:14

ajwerner reviewed Nov 15, 2022

View reviewed changes

srosenberg reviewed Nov 16, 2022

View reviewed changes

dhartunian removed the request for review from a team November 21, 2022 16:18

ZhouXing19 closed this Jan 23, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ptstorage, *: deprecate storage.ex with txn-bound internal executor #91404

ptstorage, *: deprecate storage.ex with txn-bound internal executor #91404

ZhouXing19 commented Nov 7, 2022 •

edited

Loading

cockroach-teamcity commented Nov 7, 2022

ajwerner left a comment

ZhouXing19 commented Nov 15, 2022

ajwerner commented Nov 15, 2022

ZhouXing19 commented Nov 15, 2022

ajwerner commented Nov 15, 2022

ajwerner commented Nov 15, 2022

ajwerner commented Nov 15, 2022

rafiss commented Nov 15, 2022

ajwerner commented Nov 15, 2022

srosenberg Nov 16, 2022

ZhouXing19 Nov 16, 2022

srosenberg Nov 16, 2022

ZhouXing19 Nov 16, 2022

ZhouXing19 commented Nov 16, 2022 •

edited

Loading

ZhouXing19 commented Nov 16, 2022

ZhouXing19 commented Nov 16, 2022 •

edited

Loading

ZhouXing19 commented Nov 22, 2022

ajwerner commented Nov 22, 2022

ZhouXing19 commented Jan 23, 2023

ptstorage, *: deprecate storage.ex with txn-bound internal executor #91404

ptstorage, *: deprecate storage.ex with txn-bound internal executor #91404

Conversation

ZhouXing19 commented Nov 7, 2022 • edited Loading

cockroach-teamcity commented Nov 7, 2022

ajwerner left a comment

Choose a reason for hiding this comment

ZhouXing19 commented Nov 15, 2022

ajwerner commented Nov 15, 2022

ZhouXing19 commented Nov 15, 2022

ajwerner commented Nov 15, 2022

ajwerner commented Nov 15, 2022

ajwerner commented Nov 15, 2022

rafiss commented Nov 15, 2022

ajwerner commented Nov 15, 2022

srosenberg Nov 16, 2022

Choose a reason for hiding this comment

ZhouXing19 Nov 16, 2022

Choose a reason for hiding this comment

srosenberg Nov 16, 2022

Choose a reason for hiding this comment

ZhouXing19 Nov 16, 2022

Choose a reason for hiding this comment

ZhouXing19 commented Nov 16, 2022 • edited Loading

ZhouXing19 commented Nov 16, 2022

ZhouXing19 commented Nov 16, 2022 • edited Loading

ZhouXing19 commented Nov 22, 2022

ajwerner commented Nov 22, 2022

ZhouXing19 commented Jan 23, 2023

ZhouXing19 commented Nov 7, 2022 •

edited

Loading

ZhouXing19 commented Nov 16, 2022 •

edited

Loading

ZhouXing19 commented Nov 16, 2022 •

edited

Loading